An excess power statistic for detection of burst sources of gravitational radiation 
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We examine the properties of an excess power method to detect gravitational waves in interferometric detector 
data. This method is designed to detect short-duration (< 0.5 s) burst signals of unknown waveform, such as 
those from supernovae or black hole mergers. If only the bursts' duration and frequency band are known, the 
method is an optimal detection strategy in both Bayesian and frequentist senses. It consists of summing the 
data power over the known time interval and frequency band of the burst. If the detector noise is stationary 
and Gaussian, this sum is distributed as a \ 2 (non-central \ 2 ) deviate in the absence (presence) of a signal. 
One can use these distributions to compute frequentist detection thresholds for the measured power. We derive 
the method from Bayesian analyses and show how to compute Bayesian thresholds. More generically, when 
only upper and/or lower bounds on the bursts duration and frequency band are known, one must search for 
excess power in all concordant durations and bands. Two search schemes are presented and their computational 
efficiencies are compared. We find that given reasonable constraints on the effective duration and bandwidth 
of signals, the excess power search can be performed on a single workstation. Furthermore, the method can be 
almost as efficient as matched filtering when a large template bank is required: for Gaussian noise the excess 
power method can detect a source to a distance at least half of the distance detectable by matched filtering if the 
product of duration and bandwidth of the signals is < 100, and to a much greater fraction of the distance when 
the size of the matched filter bank is large. Finally, we derive generalizations of the method to a network of 
several interferometers under the assumption of Gaussian noise. However, further work is required to determine 
the efficiency of the method in the realistic context of a detector network with non-Gaussian noise. 

PACS number(s): 04.80.Nn, 07.05. Kf, 95.55.Ym 



I. INTRODUCTION AND SUMMARY 

A. Background and Motivation 

The inspiral, merger, and ringdown of binary black hole 
systems may be the most important source of gravitational 
radiation for detection by the kilometer-scale interferomet- 
ric gravitational wave detectors such as LIGO and 
VIRGO [§,[§]. The importance of these sources is twofold [f|]: 

1 . A large amount of gravitational radiation is expected to 
be emitted by the merger of two black holes. For in- 
termediate mass (~ IO-Mq-IOOOM©) black hole bina- 
ries this radiation will be in the frequency band of high- 
est sensitivity for LIGO and VIRGO|] These sources 
should therefore be amongst the brightest in the sky, 
and visible to much greater distances than other sources. 
The detection rate for coalescing binary black holes 
could therefore be higher than for any other source. 



* While the relative abundance of such systems is still a very open 
question, we are encouraged by two recent developments in the as- 
trophysics literature: (i) evidence suggesting that black holes in this 
mass range may exist pJfjl] and (ii) a globular cluster model that sug- 
gests LIGO I may expect to see about one black hole coalescence 
event during the first two years of operation Urn. 



2. The radiation emitted from the merger of black holes 
probes the strong field regime of a purely gravitational 
system. This radiation should therefore provide a sen- 
sitive test of general relativity. 

These benefits can only be realized, however, if the gravita- 
tional radiation from black hole mergers can be detected. 

The best understood and most widely developed technique 
for detection of gravitational waves with interferometric de- 
tectors is matched filtering [^Pp. Matched filtering is the op- 
timal technique if the entire waveform to be detected is ac- 
curately known in advance (up to a few unknown parame- 
ters). Unfortunately, the gravitational radiation from black 
hole mergers results from highly non-linear self-interaction of 
the gravitational field. This makes it extremely difficult to ob- 
tain gravitational waveforms. Efforts to do so have met with 
only limited success thus far. Binary black hole mergers will 
therefore not amenable to detection by matched filtering, at 
least for the first gravitational wave searches in the 2002-2004 
time frame. 

Similarly, there are other classes of sources, like core- 
collapse of massive stars in supernovae, or the accretion in- 
duced collapse of white dwarfs, for which the physics is too 
complex to allow computation of detailed gravitational wave- 
forms. For these sources, as for binary black hole mergers, we 
must seek alternative signal detection methods. These meth- 
ods are often called "blind search" methods. 
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One class of search methods is based on time-frequency 
decompositions of detector data. (For an exploration of a 
variety of other methods, see Ref. JTo|].) Time-frequency 
strategies have become standard in many other areas of sig- 
nal analysis [ttji]. There is also a growing literature on 
the application of time-frequency methods to gravitational 
waves [^2[- ^9|JlO| , pOpI| ] . For binary black hole mergers, it is 
possible to make crude estimates of signals' durations and fre- 
quency bands [|J, although these estimates need to be firmed 
up and refined by numerical relativity simulations. This sug- 
gests that one should look only in the relevant time-frequency 
window of the detector output. 

Flanagan and Hughes [Q] (FH) have suggested a particu- 
lar time-frequency method for blind searches. The method 
uses only knowledge of the duration and frequency band of 
the signal: one simply computes the total power within this 
time-frequency window, and repeats for different start times. 
The method detects a signal if there is more power than one 
expects from detector noise alone. Thus, we call it the excess 
power search method. Similar methods have been discussed 
elsewhere in the gravitational wave literature. Schutz J22| ] in- 
vestigated the method in the context of the cross-correlation 
of outputs from different detectors. An autocorrelation fil- 
ter for unrestricted frequencies was published by Arnaud et 
al. JTc| , p3| ] shortly after and independently of FH. A general- 
ization of the excess power filter has also been discussed in the 
signal analysis literature [^4||, where it has recently been "at- 
tracting considerable interest" []25|]. Finally a method closely 
related to the excess power method has recently been explored 
by Sylvestre @. 

The excess power method distinguishes itself for the detec- 
tion of signals of known duration and frequency band by a 
single compelling feature: in the absence of any other knowl- 
edge about the signal, the method can be shown to be optimal. 
Furthermore, it can be shown that for mergers of a sufficiently 
short duration and narrow frequency band, it performs nearly 
as well as matched filtering. 

The essence of the power filter is that one compares the 
power of the data in the estimated frequency band and for 
the estimated duration to the known statistical distribution of 
noise power. It is straightforward to show that if the detector 
output consists solely of stationary Gaussian noise, the power 
in the band will follow a \ 2 distribution with the number of 
degrees of freedom being twice the estimated time-frequency 
volume (i.e., the product of the time duration and the fre- 
quency band of the signal). If a gravitational wave of suf- 
ficiently large amplitude is also present in the detector out- 
put, an excess of power will be observed; in this case, the 
power is distributed as a non-central \ 2 distribution [|7]| with 
non-centrality parameter given by the signal power. The sig- 
nal is detectable if the excess power is much greater than the 
fluctuations in the noise power which scales as the square- 
root of the time-frequency volume. Thus, the viability of the 
excess power method depends on the expected duration and 
bandwidth of the gravitational wave as well as on its intrin- 
sic strength. For instance, the method is not competitive with 
matched filtering in detecting binary neutron star inspirals, 
since the time-frequency volume for such signals is very large, 
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To implement this method, one needs to decide the range 
of frequency bands and durations to search over. For initial 
LIGO, the most sensitive frequency band is ~ 100-300 Hz, 
and it makes sense to search just in this band. For binary black 
hole mergers, signal durations might be of order tens or hun- 
dreds of milliseconds, depending on the black hole masses 
and spins Thus, the time-frequency volume of a merger 
signal can be as large as ~ 100, and its power would need 
to be more than one tenth as large as the noise power for de- 
tectability with the excess power method. 

One can also establish operational lower bounds on the 
time durations and frequency bands of interest. Because the 
largest operational frequency bandwidth is 200 Hz for the ini- 
tial LIGO interferometers, the shortest duration of signal that 
need be considered is 5 ms (for a minimum time-frequency 
volume of unity). Similarly, for a maximum duration of 0.5 s, 
the smallest bandwidth that needs to be considered is 2 Hz. 
The excess power in any of the allowed bandwidths and dura- 
tions can thus be obtained by judiciously summing up power 
that is output from a bank of one hundred 2 Hz band-pass fil- 
ters (spanning the 200 Hz of peak interferometer sensitivity) 
for the required duration. 

Having established the statistic and its operational range of 
parameters, the following simple algorithm for implementing 
the excess power method emerges naturally: 

1. Pick a start time t s , a time duration St (containing N 
data samples), and a frequency band [/ SJ f s + Sf]. 

2. Fast Fourier transform (FFT) the block of (time domain) 
detector data for the chosen duration and start time. 

3. Sum the power in each of the ~ one hundred 2 Hz bands 
spanning the peak sensitivity region of the detector. 

4. Further sum the power in the 2 Hz bands which corre- 
spond to the chosen frequency band. 

5. Calculate the probability of having obtained the 
summed power from Gaussian noise alone using a \ 2 
distribution with 2 x St x Sf degrees of freedom. 

6. If the probability is significant, record a detection. 

7. Repeat the process for all allowable choices of start 
times t s , durations St, starting frequencies f s and band- 
widths Sf. 

This procedure, which must be repeated for every possible 
start time, can lead to moderately-large computational re- 
quirements. We find that the computational efficiency of this 
implementation, which we call the short FFT algorithm, can 
be improved upon by considering data segments much longer 
than the longest signal time duration. In this case, after sum- 
ming over the chosen band, we must FFT the data back into 
the time domain. This implementation, which we call the long 
FFT algorithm, is more efficient by at least a factor of ~ 4 over 
the parameter space of interest. 

The most significant drawback of the filter outlined above is 
that the x 2 statistic is appropriate only to Gaussian noise. Real 
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detector noise will contain significant non-Gaussian com- 
ponents. There are likely to be transient bursts of broad 
band noise that have characteristics very similar to black hole 
merger signals. 

The non-Gaussianity of real detector noise leads us to two 
considerations. First, like most blind search methods, the ex- 
cess power method will likely be a useful tool for character- 
izing and investigating the non-Gaussian components of the 
noise. In particular, it can provide a simple and automated 
procedure for garnering statistical information about noise 
bursts. This is a useful and important feature of the excess 
power method, even though we focus almost exclusively on 
signal detection in this paper. Second, since the method can- 
not distinguish between noise bursts and signals in any one 
detector, it will be essential to use multiple-detector versions 
of the power statistic for actual signal detections. In Sec. |v| 
we derive the optimal multi-detector generalization of the ex- 
cess power statistic under the assumption of Gaussian noise. 
It will be important in the future to generalize this analysis 
to allow for (uncorrelated) non-Gaussian noise components in 
individual detectors. 



The layout of this paper is as follows: in Sec. IB we be 



gin with an overview of the filter and some of its properties. 
This is done with an eye toward implementation, so that read- 
ers whose primary interest is in applying the filter need not 
concern themselves with mathematical aspects of the statisti- 
cal theory of receivers. Subsequently we discuss properties 
of the excess power statistic in Sec. y, its derivation from a 
Bayesian framework in Sec. Q an efficient implementation 
of the statistic in Sec. |y|, and the generalization of the power 
statistic to multiple detectors in Sec. |y|. 



B. Overview 

The output h(t) of the gravitational wave detector is sam- 
pled at a finite rate 1/At to produce a time series hj = 
h(jAt), where j = 0, 1, 2 ... . This output can be written 

as 

hj — fij + Sj (1-1) 

where rij is the detector noise and Sj is a (possibly absent) 
signal. For most of this paper we assume that the noise is 
stationary and Gaussian. Under these assumptions, the com- 
ponents of the Fourier transform of a segment containing N 
samples of noise, 



n k = 



N-l 
3=0 



(1.2) 



can be taken to be independent (we discuss the extent to which 
this is true in Sec. |Tv| ). The one-sided power spectrum Sk of 
the noise is defined by: 

{n k n* k } = \S k , (1.3) 

Here, (•) indicates an average over the noise distribution and 
* denotes complex conjugation. 



Consider a situation where all possible signals have a fixed 
time duration St and are band-limited to a frequency band 
[/a j Is + Sf], but no other information is known about them. 
Then, as we show in Sec. |l| the optimal statistic for detection 
of this class of signals is the excess power statistic 



£ = A \hk\ 2 /S k . 



(1.4) 



The sum in Eq. (1.4) is over the positive frequency compo- 
nents k\ < k < &2 that define the desired frequency band. 
The number of frequency components being summed is there- 
fore equal to the time-frequency volume V = StSf = k^—ki. 

In practice, we search over every time-frequency window 
that is consistent with a range of possible time durations and 
bandwidths. The number of such windows per start time is 



-^windows 



2 -^channels (-^channels 



+ l)Wnax- A^min + l) 



(1-5) 



where A^ max = St mix / At is the number of samples in the 
longest expected signal duration St max , and N m i n = 8t m m/ At. 
Here A^hanneis = ^/max/^/min is the ratio of the largest band- 
width searched over Sf max to the shortest bandwidth <5/ m ; n . 
One strategy for this search was outlined above. A flowchart 
for this algorithm is presented in Fig. [i] We call this algorithm 
the short FFT method. We have also considered a second al- 
gorithm which we call the long FFT method, and its flowchart 
is shown in Fig. |[ 

Under the conditions that (i) the number M of time domain 
data points being filtered is large and (ii) one searches over 
many different time-frequency volumes, the long FFT method 
is computationally more efficient than the short FFT method: 
the long FFT eliminates the redundancy of Fourier transform- 
ing data more than once when it falls into overlapping time- 
frequency volumes. In Sec. IV we estimate the computational 
costs of the two methods. We find that the short FFT method 



requires 



a 



V 2 



short 



2a n 



31og 2 y V 



On 



(1-6) 



floating-point operations per start time where V — 
St m!ix 5 / m ax is the maximum time-frequency volume and 
a max = Sf max At is the maximum dimensionless bandwidth. 
(The frequency band from DC to Nyquist corresponds to 



a max = 4-) The long FFT method requires only 
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(1.7) 



floating-point operations per start time; the computational im- 
provement is a factor of 



a 



short 



i 



1 V 



Gong 2 In 2 a n 



61nU 



(1.8) 



The first term shows that there is at least a factor of ~ 4 to be 
gained by the long FFT method; in addition to this, the com- 
putational gain increases with the total time-frequency volume 
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to be searched. For V = 100, the value of the second term is 
also ~ 4. 

Having computed the total power for the various time- 
frequency windows of interest, one must decide which, if any, 
of those windows might contain a signal. A signal increases 
the expected power in a window, so we seek windows contain- 
ing a statistically significant excess of power. If one knows the 
distribution of the noise data, one can derive the distribution 
for the noise power and thus set detection thresholds on the 
power. 

The power £ is distributed as a \ 2 distribution with 2V 
degrees of freedom in the absence of a signal. When a signal 
is present, £ is distributed as a non-central \ 2 distribution with 
2V degrees of freedom p7|], where the non-central parameter 
is the signal power A 2 : 

A 2 -4 ^ \h\ 2 /S k . (1.9) 

k± </c</C2 

The quantity A also represents the signal-to-noise ratio that 
one would expect to achieve if a matched filter were used to 
detect the signal. 

In Fig. |3| we show the central and non-central x 2 distribu- 
tions for several choices of parameters. It is straightforward to 
use the \ 2 distributions plotted in Fig. |3| (a) to set a frequen- 
tist threshold for the excess power statistic so that a desired 
false alarm probability is achieved; then the false dismissal 
probability can be computed as a function of signal amplitude 
using the non-central \ 2 distributions plotted in Fig. || (b). 
Alternatively, on can fix both false alarm and false dismissal 
probabilities, and then use the \ 2 an d non-central \ 2 distri- 
butions to determine the expected signal-to-noise ratio (A) of 
the signal which achieves these probabilities. A curve demon- 
strating this for a false alarm probability of 10~ 9 and and false 
dismissal probability of 0.01 is show in Fig. ||(c). 

A derivation of the statistical properties of the excess power 
statistic (both with and without a signal) can be found in 
Sec. In Sec. |m| we show that the method is optimal and 
unique under suitable assumptions using a Bayesian analysis. 

The excess power statistic requires minimal information 
about the signals to be detected, making it a useful statistic 
for poorly modeled sources. Nevertheless, it is useful to com- 
pare the detection efficiency of the method to that of matched 
filtering. We characterize the excess power filter by the time- 
frequency volume V and the matched filter bank by the num- 
ber A^ff of effectively independent filters. For given false 
alarm and false dismissal probabilities, we denote the ampli- 
tude of the weakest signals detectable using the excess power 
filter or a bank of matched filters by A^ a and Ajjj? respec- 
tively. The relative effectiveness rj of the two search methods 
is given by the ratio of these amplitudes: r] = A^ in /A^f n . In 
other words, the excess power statistic can detect a source at 
a fraction r) of the distance to which a bank of matched filters 
can. The relative effectiveness t] is plotted as a function of the 
time-frequency volume V and the number of templates Af e g 
is shown in Fig. |j. This figure shows that for time frequency 
volumes V less than ~ 100, and for all values of the effec- 
tive number of templates A4ff, the relative effectiveness rj is 
greater than 1/2. 



II. THE SEARCH METHOD IN A SINGLE 
INTERFEROMETER 

In this section we define the search method in the context of 
a single interferometer, and derive its operating characteristics 
from the frequentist statistical framework. 

A. Definition of method for a single time-frequency window 

Consider stretches of discretely sampled detector data h = 
{ho, hi, . . . , ft.jv-i} consisting of N data points. We will de- 
note by V the A^-dimensional vector space of all such data 
stretches. We assume that the detector output consists of a 
stationary, zero-mean, Gaussian noise component rij, plus a 
possible signal sj, so that hj = rij + sj. Under these assump- 
tions, the statistical properties of the noise are characterized 
by the N x N correlation matrix 

R ij = {n i n j ) = C n [\i-j\At). (2.1) 

Here C„(<) is the correlation function of the noise and At 
is the sampling time. This correlation matrix determines a 
natural inner product on V given by 

JV-l 

(a,b) = ^2 aiQijbj. (2.2) 
id=o 

where Q = R _1 . 

We now discuss the notion of time-frequency projections. 
Consider the time-frequency window 

T = {t s ,St,f s ,Sf} (2.3) 

defined by the frequency interval [/ s , f s + Sf], where f s is a 
starting frequency and Sf is a bandwidth, and the time inter- 
val [t s ,t s + St], where t s is a starting time and St is a dura- 
tion. Suppose that we want to focus attention on that portion 
of the data that lies inside the time-frequency window T, to 
the extent that this is meaningful. One obvious thing to do 
is to truncate the data in the time domain, perform a discrete 
Fourier transform into the frequency domain, and then throw 
away the data points outside the frequency band of interest. 
One then obtains the quantities 

N t -1 

H K = £ e 2 ™ JK / N *h j+J , (2.4) 

.7=0 

where j = t s /At, N t — St/ At, and K runs over the range 
f s 8t < K < (f s + Sf)St. We denote by Wj the vector space 
of the projected data Hk- The dimension of this vector space 
over the real numbers is 

dim W T = 2St5f = 2V T , (2.5) 

where Vr = StSf is the time-frequency volume of the time- 
frequency window T. 
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Of course, there are many other methods of attempting to 
pick out the portion of the data in the time-frequency win- 
dow T.0 The lack of a preferred unique method is due to the 
uncertainty principle. In many circumstances the differences 
between different reasonable choices will be relatively unim- 
portant. For the remainder of this section we will assume that 
we have picked some reasonable projection method. | We can 
write the projected data in general as 



N-l 

E 

3=0 



Aj hj, 



(2.6) 



where A j is a real 2Vr x N matrix, the quantities hj are real, 
and J runs over < J < 2Vq — 1. 

We define the power statistic associated with T and with a 
choice of projection method to be 



2V-T-1 



£ T (h)= Y, Quhh 



i,j=o 



where J2j QijRjk = $i K and 



Rjk = (hjh K ) = A J A RR]k 



(2.7) 



(2.8) 



is the correlation matrix of the projected data. The quantity 
£ is, roughly speaking, just the total power in the data stream 
within the given time-frequency window, where power is not 
the physical power but is measured relative to the detector 
noise (i.e., it is the conventional power of the pre-whitened 
data stream). 

The statistic can also be described geometrically as fol- 
lows. The linear mapping defined by Eq. ( |2.6| ) has a kernel 
{hi \ hi = for all /}. The set of all vectors h perpendicular 
t o al l elements of this kernel with respect to the inner product 
( |2.2| ) form a subspace VV of V which can be naturally identi- 
fied with VVt- Any element h in V can be decomposed as 



■II 



(2.9) 



where h|| lies in Vr and hj^ is perpendicular to all elements 
of Vr- The statistic ( |2.7| ) is the squared norm of the parallel 
component: 



£r(h) = (h| h h| U 



(2.10) 



For the simple time-frequency truncation method (2.4) dis 



cussed above, one can obtain a simple approximate formula 



^For example, one could FFT the entire data segment, truncate it 
in the frequency domain, FFT back to the time domain, and then 
truncate it again in the time domain. 

*The choice of a projection method corresponds mathematically to 
the choice of a 2Vt -dimensional subspace of the dual space V* of 
V. When one specifies in addition the detector noise spectrum, the 
projection method determines a 2 Vr dimensional subspace of V. 



for the stat istic. The correlation matrix of the quantities Hj 
of Eq. (24) is given by, to a first, crude approximation, 



(Hj H K ) = 0, 

{HjH* K ) — \8jk Sk, 
where < J, K < N t /2, 

S K = StS h {K/St)/(At) 2 



(2.11) 
(2.12) 

(2.13) 



and Sh(f) is the conventional one-sided power spectral den- 
sity of the detector noise. The expressions (2.11) and (2.12) 



are accurate only when f s St 1 and when Sh(f) does not 
vary substantially on scales ~ 1/St; more a ccur ate expres- 
sions can be c omp uted if desired from Eqs. (2.1) and (2.4). 
The definition (2.7) now yields 



(f,+Sf)St 

£r(h)«4 Y \Hk\ 2 /S k , 

K=f s St 



(2.14) 



cf., Eq. (1.4) of the Intr oduc tion. We show in Sec. IV be 



low that the expression ( 2.14 ) is an adequate approximation 
to £r(h) for most purposes. 

The search method consists of searching over time- 
frequency windows T, and selecting as possible events only 
those windows T for which £r exceeds a suitable threshold 
£*. We discuss further how to search over time-frequency 
windows in Sec. IV below. For the remainder of this section 



we assume that the time-frequency window T is fixed and 
known, and discuss the performance of the statistic. 



B. Operating characteristics of the statistic 

When a signal is not present in the data stream, the statistic 
£ = £r(h) is the sum of the squares of 2V independent, 
zero-mean, unit-variance Gaussian random variables.^ Thus 
£ follows a x 2 distribution with 2V degrees of freedom; the 
upper-tail cumulative probability is 



Q {£*) = P(£ > £*) 



T(V,£*/2) 
T(V) ' 



(2.15) 



where T(a, x) — f°° e~'f a_1 c?i is the incomplete Gamma 
function. The quantity Qo(£*) is the false alarm probabil- 
ity for the detection threshold £*. The distribution ( 2.15 ) is 
plotted in Fig. ^ for several values of V. An approximate ex- 
pression for Qq in the regime Qo <C 1 is QZffl 



Me*) 



A 2 
2V 



1 + 



O 



1 

aJ 



o 



At) 



(2.16) 



§To see this, note that there is a basis of Wr in which the correla- 
tion matrix Rjk is equal to the 2V x 2V identity matrix, and use 
Eqs. and (Q. 
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where A 2 =£*- 2V. 

We next consider the case when a signal is present, so tha t 
h = n + s. The formula for the statistic £ given by Eqs. (2.6) 
and (0) becomes 



2V-1 

E 

J, .7=0 



Qu{ni + s/)(nj + sj), 



(2.17) 



where rij = J2j^i n j i s tne projected noise and sj — 
^jAjSj is the projected signal. We define the amplitude 
A of the signal byQ 



A 1 



2V-1 

E 



QljSlSj, 



(2.18) 



where we use the notation of Eq. (2.9). The expression (2.17) 
can be simplified by (i) choosing the basis of Wr so that 
Qu = 5u (which is roughly equivalent to whitening the de- 
tector output for a large class of time-frequency windows) and 
(ii) further specializing the choice of basis so that the signal 
vector is (sq, si, . . . , s-rv-i) — {A, 0, . . . ,0). The result is 



£ = (n + A) 2 + 



2V-1 

i=i 



(2.19) 



where no, n\, . . , , n^v-i are independent Gaussian random 
variables with zero mean and unit variance. 



From Eq. ( 2.19 ) one can compute the moment generating 
function for the random variable £. The result is 



t£ _ exp[AH/(l - 2t)} 
' ' {l-2t) v 



(2.20) 



The probability distribution for £ can be now obtained by tak- 
ing the inverse Laplace transform, which is accom plishe d by 
expanding the argument of the exponential in Eq. ( 2.2C ) as a 
power series in A. The result is a weighted sum of ^proba- 
bility distribution functions: 



p(£\A,V) = J2 



-A 2 /2 



{A 2 /2) r 



-S/2 



{£12) 



n+V-l 



n=0 



T(n + V) 



(2.21) 



This is the non-central x 2 probability distribution with non- 
centrality parameter A 2 discussed in Sec. 26.4 of Ref. ]^t|]. A 
closed form expression for the probability distribution is [[28]] 



p(£\A,V) 



l e -{£+A 2 )/2 ( 



(s^/Ay-'iy 



v-i-, 



(2.22) 



"Note that A is the signal-to-noise ratio that would be obtained by 
matched filtering if prior knowledge of the waveform shape allowed 
one to perform matched filtering, and if the signal s were confined to 
the time-frequency window T. 



where /„ (x) is the modified Bessel function of the first kind 
of order n. 

The upper-tail cumulative probability distribution for £ 

/>oo 

Qa(£*, A V) = P(£ > £*\A, V) = / P (£\A,V)d£ 

J £* (2.23) 

is the true detection probability for a given threshold £* and a 
given signal amplitude A. Figure ^ shows this true detection 
probability Qa, expressed as a function of sig nal strength A 
and false alarm probability Qq [via Eq. (2.15)], evaluated at 
Qa = 0.01, for several different values of the time-frequency 
volume V. 

Some qualitative insight into the detectability of a signal 
can be obtained from the first two moments of the distribution 
for £. The expected (mean) value is (£) = 2V + A 2 , while 
the variance is Var£ = (£ 2 ) - (£) 2 = AV + AA 2 . For large 
values of V the probability distributions are nearly Gaussian 
within a few sigma of the expected value, so we can imagine 
setting the threshold £* to be a few times v 'AV above the 
mean noise level 2V in order to achieve the required false 
alarm probability. Thus, a signal will be detectable when £ — 
2V > (a few) x \[W . In other words, the signal power A 2 
can be small compared to the mean noise power 2V and still 
be detectable; it need only be comparable to the much smaller 
fluctuations ~ V AV in the noise power. 

Once one specifies the time-frequency volume V and de- 
sired values of the false alarm probability Qq and true detec- 
tion probabilities Qa, there is a minimum signal amplitude 
A m i n that can be detected with the excess po wer m ethod. To 
compute this amplitude, one first inverts Eq. (2.15) to obtain 
the required threshold £* as a functi on of Qq and V: £* = 
£*{Qo, V). Second, one inverts Eq. ( 2.23| ) to obtain the am- 
plitude A as a function of £*, Qa and V: A = A(£* , Q A , V). 
The minimum signal amplitude is then given by 



4„in(Qo, Qa, V) = A[£*(Q , V), Q A , V}. 



(2.24) 



This quantity is plotted as a function of the time frequency 
volume in Fig. || (c), for various values of Qq and for Qa = 
0.99. 



C. Comparison of performance to that of matched filtering. 

The performance of the excess power statistic should be 
compared with that of a matched filtering search for the same 
class of signals. Of course, matched filtering searches will not 
be possible for the classes of signals we are interested in (for 
example supernovae) due to the lack of theoretical templates; 
nevertheless the comparison is useful as benchmark of the ex- 
cess power method. 

We start by discussing how the performance of matched 
filtering depends on the set of signals being searched for. 
Suppose that a given class of signals have a known duration 
and frequency band, so that they all lie inside a fixed time- 
frequency window T with time-frequency volume V. Let 



6 



At(Qo) be the signal-to-noise ratio threshold for the matched 
filtering search necessary to give a false alarm probability of 
Qo for each starting time. Now if the bank of matched fil- 
ters consisted of M statistically independent filters, then the 
threshold A+ would be given by the formula 



erfc(A*/V2) = 1 



(1-Qo) 1/A "«0o/AA. 



(2.25) 



In the more realistic case where the templates are not all sta 



tistically independent, the formula ( 2.25 ) motivates us to de- 
fine an effective number of independent templates A/" e ff = 
■/Veff(Qo) by the relation^ 



rfc[A*(Q )/V2| = Qo/K«(Qo) 



(2.26) 



This effective number of templates depends on the false alarm 
probability Qo, or, equivalently, on the detection threshold 
via the relation A+ = A+(Qo). For a given class of signals 
(e.g., inspiralling binaries), it should be possible to estimate 
A/'eff by a modification of the method of Ref. [|9|] wherein 
one does not eliminate the intrinsic parameters or the signal 
amplitude and one chooses a minimal match of order ~ 0.3 
say to give approximately statistically independent templates. 
We suspect that A4ft will not differ too significantly from the 
actual number of templates used in a search.^ In any case, 
for a given matched filtering search, the detection threshold 
and the resulting effective number of independent filters will 
be determined by Monte Carlo simulations. 

An illustrative special case of matched filtering is when the 
signal manifold is a linear subspace S of the space of all pos- 
sible signals. In this case the maximum over all templates of 
the signal-to-noise ratio squared is simply (hy , h|| ) , where hy 
is the perpendicular projection of the detector output h into S 
[pOP. Since this quantity has a x 2 distribution, it follows from 



the approximate formula (2.16) and from the definition (2.26) 



that for this special case we have 



Kff(A+) 



A, 
1 + 



d 



O 



d/2 



o 



Ai 



O 



At 



(2.27) 



where d = dim(iS) is the number of signal parameters. In Ref. 
101 this relation was used to define an effective dimension for 



^In Ref. [^] the quantities A±, Qo and A4ff were denoted p*, 
e/A/" s tan timc S , and Wshapes, respectively. 

M The grid of search templates used will be determined by having a 
minimal match | ^ | of ~ 0.97 say instead of ~ 0.3, which tends to 
make the actual number of templates larger than A4ff. On the other 
hand, a template grid needs only 2 templates to cover all possible 
signal amplitudes and phases (when the other parameters are fixed), 
whereas the number of statistically independent templates that can be 
generated by varying the amplitude and phase can be much greater 
than 2 for small Qo- 



any space of signals, and that effective dimension was used in- 
stead of A/'eff (Qo) to characterize the signal space. Here how- 
ever we will instead parameterize our comparisons directly in 
terms of A/" e ff. 

We also note that for this special case of a linear signal sub- 
space, we have 



Af eB (A i ,)^2 I -^\ 



(2.28) 



where is the number of bits of information about the 

source carried by a detected signal with signal-to-noise ratio 
A+, as defined in Ref. [|30|]. We conjecture that this relation 
might be approximately valid for general signal manifolds. 

We now turn to the relative performance of the excess 
power and matched filtering search methods. As explained in 
the previous subsection, once we specify the false alarm prob- 
ability Qo and true detection probability Qa we can compute 
the minimum signal amplitude A^ necessary for detection 
via the excess power method, as a function of Q , Qa and V. 



Let us denote that value here as A^ n . 



A 



EP 



Anin(Qo, Qa, V). 



(2.29) 



Similarly, we can compute the minimum amplitude neces- 
sary for detection with matched filtering with A4ff indepen- 
dent templates; the result is 



-4 



MF 



A^(Qo/AU<2a,1/2)- 



(2.30) 



In other words, one simply uses the same formula with Qo 
replaced by Qo/Af e s, and with the number of degrees of free- 
dom 2V being unity. We define the relative effectiveness rj of 
the excess power method relative to the bank of filters by 



v(Qo,Qa,Mb,V) = A%J4 



iMF 



(2.31) 



The factor by which the event rate for the excess power 
method is smaller than that for the bank of matched filters 
3 The relative effectiveness is plotted in Fig. ^ as a func- 



1S 7/° 



tion of V and A4ff. For V < 100 we see that rj > 0.6 al- 
ways, showing that the excess power method performs almost 
as well as matched filtering. 

When the time-frequency window T is not known in ad- 
vance, one must search over time frequency windows. This 
reduces the efficiency of the excess power method compared 
to matched filtering. An approximate parameterization of this 
reduction can be obtained by replacing in Eq. (2.29) the false 
alarm probability Qo by Qo/A/"„, where A/"„ is the number 
of statistically independent time frequency windows searched 
over per start time. We have performed Monte Carlo simu- 
lations with white Gaussian noise which suggest that JV W < 
100V^ ax , where V max is the largest time-frequency volume 
searched over. The resulting change in the relative efficiency 
7] is not very large. 

Further insight into the relation between the excess power 
and matched filtering methods can be obtained as f ollows. 
First, an approximate formula for the function (2.24) is ob- 
tained by approximating the distribution of £ to be a Gaus- 
sian: 
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A min {Q , QA, V) 2 = A*(Q , V) 2 

+2\/2err 1 (2Q J 4 - 1)^V + A±{Q ,V) 2 , (2.32) 



where A iI (Qo, V) is obtained by inverting Eq. (2.16). This 



formula is typically accurate to a few percent for 0.5 < Qj 



< 

0.99. When Qq <C 1 — Qa (which will t ypica lly be the case), 
we can neglect the second term in Eq. (2.32) in comparison 
with the first, so that 



^L(Qo,Qa,V)*sMQo,V). 



(2.33) 



Second, the quantity A^(Qq,Qa) will similarly depend 
weakly on Qa and will be well approximated by the quan- 
tity A*(Qo) obtained from Eq. (]2.26[ ). Combining these ap- 
proximations together with Eq. ( |2.16| ), we see that the excess 
power method is equivalent (in terms of detection thresholds) 
to matched filtering with an effective number of templates of 



A'e 



eff 



3. 

2V 



A, 



2V 



(2.34) 



The quantity (2.34) was shown in Ref. pOj ] to be the to- 
tal number of distinguishable signals within the given time- 
frequency window with signal-to-noise < A*. In other words, 
it is the maximum possible value for the effective number of 
independent templates Af e s for any manifold of signals inside 
the time-frequency window Vt, as a function of the time- 
frequency volume V. Hence, we can understand the excess 
power method as a limiting case of matched filtering: the case 
where the manifold of signals becomes so large (perhaps curv- 
ing back and intersecting itself) that (when smeared out by 
the noise) it effectively fills up the entire space Vt of signals 
within the given time-frequency window. The equivalence can 
also be seen from the fact, noted above, that the excess power 
statistic coincides with the matched filtering statistic when de- 
pendence of the signal on its parameters is linear and the num- 
ber of parameters coincides with the dimension 2V of Wt 

s 



III. BAYESIAN ANALYSIS OF SIGNAL DETECTION 

In this section we show how our proposed search method 
arises naturally from an analysis of th e dete ction of signals 
from a Bayesian point of view. Section [II A defines the class 
of signals under consideration i n term s of a prior probability 
density function ( PDF) . Section [II B derives the excess power 
statistic, and Sec. [II C compares detection criteria based on a 
false alarm rate to criteria based on the probability that a signal 
is present in the data. 



PDF p(s) for signals s in the vector space V. In this sub- 
section we explain how to encode knowledge of the expected 
bandwidth and duration of the signals in the PDF. 

Suppose we know that the signal s lies approximately 
within some time-frequency window T — [t s ,t s + St] x 
[fs,f s + Sf], but that nothing else is known about the sig- 
nal. Then we know that s belongs to a subspace Vt of V. 
Of course, there are several slightly inequ ivalent choices of 
such a subspace, as discussed in Sec. II A above, but we will 



assume that these differences are unimportant, and pick one 
choice of Vt- 

For any vector h in V, we can write h = h|| + h^, where 
h|| is the projection of h into Vt and hj^ is perpendicular to 
Vt- We assume the following form for the prior PDF p(s\T) 
given the time-frequency window T: 



p(s\T)d N s = S( N ~ 2V \s ± )S N - 2V *> 



x Pl (A)d 2y Sll 



(3.1) 



where A 2 — (sm , sn ) and N is the dimensions of V. Here the 
first factor consisting of the (N — 2V A )-dimensional 5 function 
restricts s to lie in Vt- The second factor depends only on the 
magnitude A of Sn, which means that we assume all directions 
in the vector space Vt are equally likely when one measures 
lengths and angles wi th t he inner product (|2~2|).p| We can 
rewrite the prior PDF (3. 1 ) as 



p(s\T) d N s = S^ N - 2V \s ± ) S N - 2V ~>s ± 

x^±d( 2V -i%p(A)dA (3.2) 

where d^^^Q^ is the (2V — l)-dimensional element of 
solid angle and where p(A)dA is the probability that the sig- 
nal amplitude lies bet ween A and A + dA. We discuss the 
choice of p(A) in Sec. [II C below. 

So far we have assumed that the time interval [t s , t s + St] 
and frequency interval [f s ,f s + Sf] are known. In a real 
search, however, one must account for ignorance of these pa- 
rameters. An appropriate prior which does this is 



p(s) d N s = p(s\T) d iy s pr(T)d*T, 



A' 



(3.3) 



where pt{T) = PT(t s , St, f s , Sf) is a prior PDF on the time- 
frequency window parameters. The PDF pt{T) should be 
uniform in t s , but its dependence on the parameters St, f s and 
Sf will depend on the class of sources under consideration. 
We will see below that our analysis depends only weakly on 
the choice of PDF pt{T), as long as it is a slowly varying 
function of its parameters. 



A. The space of signals 

The signals of interest (e.g., black hole mergers) are poorly 
understood. We characterize our knowledge in terms of a prior 



"It would be more realistic to make this assumption with respect to 
an inner product on V whose definition did not depend on the noise 
spectrum, but if the noise spectrum does not vary too rapidly within 
the bandwidth of interest, the distinction is not too important and our 
assumption will be fairly realistic. 
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B. Derivation of the search method 

In the Bayesian approach to signal detection there is a 
unique and optimal method to search the data stream for sig- 
nals if the statistical properties of the detector noise and the 
prior probability distribution for signals are known. One com- 
putes the probability p s (h) that some signal s is present in 
the measured data h. The signal-detection criterion is that the 
probability p s (h) exceeds some threshold value. This is the 
starting point for our analysis; more details can be found in 
Wainstein and Zubakov [ f3l| ] or F inn a nd C hernoff [ |32|]3~3| ] . 

The prior PDF given Eqs. (3.2) and (3.3) describes our state 



of knowledge about the signals to be searched for. Now let p s o 
denote the a priori probability that gravitational waves exist 
(or that our signal model is correct), for which an appropriate 
value for the first searches might be p s o — 1/2. The signal 
PDF (3.3) then gets modified to 



(l-p s0 )5"(s)d"s+p s0 p(s)d 



(3.4) 



It then follows that the posterior probability p s (h) that a signal 
is present in the data h is given by 



A(h)- 

where the likelihood function A(h) is 
A(h) = J A(h;s)p(s)d N s 



PsO 



PsO 



(3.5) 



Jj A(h;s) p(s\T)p(T)d N sd 4 T (3.6) 



and 



A(h;s) 



p(h\s = 0) 



(3.7) 



InEq. (3.7)the quantity p(h|s) is the probability of measuring 
the time series h when the signal s is present, and p(h\s = 0) 
is the corresponding probability when no signal is present. For 
stationary Gaussian noise the likelihood ratio A(h; s) is [p2|], 

A(h; s) = exp[(h, s) - (s, s)/2]. (3.8) 

Equation ( |3.5[ ) shows that the probability p s (h) increases 
monotonically with increasing A(h). Consequently, thresh- 
olding on A(h) to detect signals is equivalent to threshold- 
ing on the probability that a signal is present in the data 
stream. This is also the optimal signal detection strategy in 
the Neyman-Pearson sense of maximizing the detection prob- 
ability for a fixed false alarm probability [ pl| , p4[ ]. 

The integral (3^) includes a integral over all possible time- 
frequency windows T, which can be approximated as a sum: 

A(h)«— i— £ /A(h;s)p(s|T)^s. 

-^windows ^_ J (3 9) 

Here Addows is the number of grid points in a grid on the 
four dimensional space of time-frequency windows used to 



approximate the in tegral. Now if a signal is present, the sum- 
mand in Eq. ( 3.1C ) will be a sharply peaked function of T. If 
the grid spacing is chosen to approximately coincide with the 
width of the peak (so that A/" w i n dows is the number of statisti- 
cally independent time-frequency windows) then the sum will 
be dominated by the largest term, and we obtain 



A(h) 



1 

77 max 

•Al windows ^~ 



J A(h;s)p(s|T) d N s. 



(3.10) 



Thus it is sufficient to consider only a single time-frequency 
region in the remainder of this section with the understanding 
that the signal detection will be based on the maximum of the 
likelihood function over all relevant time-frequency windows. 
Also we can factor A/"„i n dows as A/"„i„dows = Mt A/" w , where Af sX 
is the number of statistically independent starting times t s in 
the search, and A/"„ is the number of statistically independent 
time-frequency windows per start time. 



The evaluation of the integral in Eq. ( 3.10 ) with the prior 
PDF given by Eq. (3.2) can be done in stages. First we in- 
tegrate over the delta-function to restrict the possible signals 
to the vector space Vr- This essentia lly re places s by sn in 
Eq. (3.8). Next we use the definition (2.10) of £ to write the 
inner product appearing in Eq. (3.8) as 



(h, 



A£ 1/2 cos9, 



(3.11) 



where 9 is the angle between the vectors hii and Sii . We then 
obtain the formula 



A(h) = 



1 



AC AC 



A(h; A) p(A) dA 



(3.12) 



with 



A(h;A) = 



T(V), 



-A 2 /2 



.AS 1 



-cose sin 2V-2 ( 



cW 



7rV2r(V-l/2), 
T(V)e- A2 / 2 (A£ 1 ^/2) 1 - v I v ^ 1 (A£ 1 / 2 ) 
p(8\A,V)/p{8\A = 0,V) (3.13) 



[cf. Eq. (2.22)]. Here I n (x) is the modified Bessel function 
of the first kind of order n, and maximization over time fre- 
quency windows T is understood. 

The quantity (3.13) is a monotonically increasing function 
of the power £. Hence thresholding on A is equivalent to 
thresholding on £, and so £ is the optimal (in the Bayesian 
sense) statistic for the detection of the class of signals we have 
considered. 



C. Bayesian thresholds 

Frequentist detection thresholds £* are set by speci fying a 
false alarm rate, and can be computed using Eq. ( 2.15 ). As is 
well known, if such a threshold is exceeded it does not neces- 
sarily mean that a signal is present with high probability, even 
for low false alarm probabilities ||35|-|37|] . To determine how 
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likely it is that a signal is actually present in the data stream 
requires the use of Bayesian methods. 

For a Bayesian detection strategy, one sets a threshold on 
the posterior probability p s (h) that a signal is present given 
the data. This probability is related to the li keliho od function 
A(h) by Eq. (gj). In general, the integral ( |3.12| ) required to 
compute A(h) must be performed numerically. Since A(h) 
depends on the data h only through £(h), one can determine 
a Bayesian threshold for £ from the value of p s . 

Consider a search characterized by Af st statistically inde- 
pendent start times and J\f w statistically independent time- 
frequency windows. The frequentist false alarm probability 
Qo of the previous section [Eq. (2. 15)] is the false alarm prob- 



ability for a given start time and a given time-frequency win- 
dow. Hence the false alarm probability for the entire search 



(3.14) 



It is natural, in comparing frequentist and Bayesian thresh- 
olds, to set p s — 1 — pf a . For example, for "99% confidence" 
one would choose p s — 0.99 = 1 — pt- d . We emphasize that 
this means "99% confidence that events will be due to signals" 
for the Bayesian, while it means "99% confidence that there 
will be no false events" for the frequentist; since these are dif- 
ferent statistical statements, the frequentist and the Bayesian 
will obtain different thresholds. 

We first discuss a pprox imate evaluation of Bayesian thresh- 
olds. The integral ( |3.12 ) can be approximately evaluated in 
the the regime V 3> 1 by using the Laplace approximation, if 
the prior PDF p(A) does not vary too rapidly. The result is 



V2nV p(A) 



-v 




A(h) 



where A — A(h) is defined by 
£(h) = 2V 



i(h) 2 



(3.15) 



(3.16) 



If we now compare Eqs. (2.16), (3.5), (3.14) and (3.15) and 
use p s o = 1/2 and 1 — p s <C 1, we see that the Bayesian 
threshold A and frequentist threshold A+ are related by 




-v 



,A 2 /2 



T 1 



3. 

2V 



where the factor T is 



T 



AAl 
2Vp(A) ' 



(3.17) 



(3.18) 



Clearly the two thresholds coincide when T = 1. However, 
typically the factor T can be quite significant and can cause 
the Bayesian and frequentist thresholds to differ substantially. 



It is useful to consider a specific example. Suppose that 
we are searching for black hole mergers which we expect to 
produce short (a few ms) broad band signatures with a time- 
frequency volume of V — 100. We want to be 99% sure of 
our detection, so we set p s = 0.99 = 1 — pf d . Suppose that 
the search duration is 1/3 of a year, so that the number of in- 
dependent start time is A4t = 10 10 say, and that the number 
of statistically independent windows is 10 4 . For 99% confi- 
dence that there will be no f alse alarms, we should choo se 
Qo = 10 -16 , from Eq. ( fig ), This gives from Eq. §A$ ) a 
frequentist threshold of £* = 411.3 corresponding to a signal- 
to-noise threshold of A*. = 14.5. The corresponding Bayesian 
threshold depends on the specification of prior PDF for signal 
amplitudes A. A reasonable choice of prior is 



p{A) 



f 3A 3 JA 4 A>A C 
\ A < A c 



(3.19) 



This is just the distribution that would be expected for sources 
distributed uniformly in time and space, except that it is cut- 
off in an approximate way at small A in order to ensure cor- 
rect normalization. The parameter A^ is prior probability per 
start-time of an event being present with signal-to-noise ratio 
exceeding unity. Based on population estimates such Ref. [Q], 
we optimistically assume a prior probability of order unity for 
approximately one merger event per year with A > 1, which 
translates into A^ ~ 10~ 10 . We c an now comp ute a Bayesian 
threshold by combining Eqs. (3.5), (3.12), and (3.13). There- 
suit is £ = 539 corresponding a signal-to-noise ratio threshold 
of A = 18.4, which is substantially higher than the frequentist 
value of 14.5. 



IV. IMPLEMENTATION 



As discussed in Sec. \1A[ one will not know in advance the 
start time t s , duration St and frequency band [f s , f s + Sf] of 
signals in a real search, and thus one must perform a search 
over these four parameters. One needs to compute the excess 
power £r(h) for each possible time-frequency window, and 
record as possible events all of those windows for which £q- 
is above threshold. *** We assume that we wish to search over 



all values of St in the range 

St min < St < 5t m . dx , 

and over all f s and Sf with 



Sfmin < 5f < Sf md 
/min ^ fs 

fs + Sf < f min + Sf 1T 



(4.1) 



(4.2) 



***Note that different thresholds will be required for each win- 
dow T, but the false alarm probability Qo will be the same for each 
window. 



10 



It is clear that the computational cost can quickly grow to un- 
reasonable proportions, so it is important to achieve an effi- 
cient implementation of the search technique. 

There are (at least) two different ways to implement a 
search over a pre-specified set of time-frequency windows. 
The first uses many FFTs of data segments with durations 
in the range (1.1) as suggested by the derivation in Sec. |l| 
and fo r eac h FFT computes £ for all frequency bands in the 
range (4.2). This process is then repeated for every possi- 



ble start time. We call this procedure the short FFT method. 
The second method partitions the time series into long data 
segments each containing M samples, and for each of these 
segments computes its FFT. That FFT is then partitioned into 
8 f mm/ 8 f nun non-overlapping frequency bands each of width 
Sfmin, and for each one the FFT is bandpass filtered to that 
frequency band and then inverse Fourier transformed. The 
result is <5/ max / Sfmin different timeseries, which we call chan- 
nels, each containing particular frequency information. The 
elements of these time series are then squared. One obtains 
in this way a time-frequency plane in which each pixel repre- 
sents the total power in a time-frequency volume of order ~ 1. 
Finally, one computes the total power in the various rectangles 
in this time-frequency plane. We call this procedure the long 
FFT method. We now consider the computational cost of each 
method in turn and argue that the second method is more com- 
putationally efficient. 

A. Sufficiency of approximate version of statistic 



In Sec. [II A| above we discussed how to compute an approx- 
imate version of the excess power statistic [cf. Eq. (2.14)]. 
Namely, for a given start time t s , perform a Fourier transform 
of the K point time series {hj} corresponding to the time 
window St, where K = St/ At. Denote the discrete Fourier 
Transform (DFT) by hk where < k < K/2. Identify the 
frequency components k\ < k < fc 2 of hk which belong to 
the frequency band Sf, and construct the statistic 



* = * E 



\h k \ 2 /S k , 



(4.3) 



where S k is the noise power spectrum defined in Eq. ( |2.13[ ). 

The quantity (4.3) differs from the exact statistic £ due to 
the fact that the expectation value (hkht,) is not diagonal. (It 
becomes effectively diago nal o nly in the limit St — > oo.) Con- 
sequently, the expression (4.3) is not a sum of squares of in- 
dependent unit-variance Gaussian random variables, and so 
its distribution could in principle differ from the non-central 
X 2 distribution. However, in practice, if the power spectrum 
of the noise is a slowly varying function of frequen cy, then 
the correlations introduced by using the expression (4.3) are 



small. T o con firm this, we have examined the behavior of the 
statistic (4.3) computed from the DFT of colored, Gaussian 
noise. We generated colored noise according to the correla- 
tion generating scheme 

rij = {nij - 0.8toj_i + 1.2944 7ij_i - 0.64n j _ 2 )/1.3145 

(4.4) 



where m,j are uncorrected Gaussian deviates and rij = rrij = 
for j < 0. To determine detection statistics, we used the 
signal model 



Sj = Sexp[-16(j/2JV- l) 2 ] cos(27rj/o) 



(4.5) 



with N = 4096 samples in the signal. The central frequency 
of the signal was /n = 600/4096 and the constant S was cho- 
sen to give the required value of signal amplitude A. We found 
the operating characteristics of £ were not significantly af- 



fected by using the approximate formula (4.3) rather than the 
exact formula. This is demonstrated in Figs7|5 and || where we 
have overlaid simulated false alarm and true detecti on pr oba- 
bilities on top of the distributions computed in Sec. PH. We 
calculated the goodness of fit using a \ 2 test for a few of the 
curves in these figures. In each case, the reduced x 2 value 
was < 1.03, indicating that it is unlikely that the simulated 
data is draw n from distributions other than those presented 
in Sec IIB. We the refo re conclude that we can use the ap- 



proximate formula (4.3) without significantly modifying the 
behavior of the statistic £. 



B. The short FFT method 

The algorithm is: (1) pick a start time, (2) pick a time du- 
ration St, (3) FFT the selected data and compute the power 
in each frequency bin, (4) sum the power in the bands of in- 
terest, (5) loop over steps (2)-(4) until all time durations are 
used, and (6) repeat steps (l)-(5) for all start times. 

The computational cost for steps (3)-(4) can be estimated 
as follows. The number of data points in segment of data of 
duration St is TV = St/ At, so each FFT requires 3_/Vlog 2 N 
floating point operations. Since the relevant frequency band 
has a dimensionless bandwidth^ a max = <5/ max At, the total 
cost to compute the power in each frequency bin in the band 
is 3-/Va max operations. Now, for the fcth frequency bin, it costs 
(Na m nx — k) floating point operations to compute the power 
in all frequency intervals whose have lowest frequency com- 
ponent is in the kth bin; the number of operations required to 
do this for all frequencies k is 



JVa max -l 

E 

fc=i 



(-/Va max - k) 



: (iV« max - 1). 



(4.6) 



These steps must be repeated for each TV with 7V m j n < N < 
iV max , where iV min = St min /At and iV raax = <5t max /At. Thus, 
the total computational cost per start time C s hort is 



jv„, 



t (Na 



N=N m 



"I)]- 

(4.7) 



"^By dimensionless bandwidth we mean number of frequency 
bins, i.e., bandwidth multiplied by At. Note that the dimensionless 
bandwidth of the entire frequency band up to the Nyquist frequency 
is 1/2. 



11 



One will typically have N m i n ~ 1 and A^ max o; max = V » 1, 
where V is the total time-frequency volume to be searched. 
In this case, a useful approximation to the computational cost 
per start time is 



Cshort 



V 2 



2a n 



31og 2 V 



On 



(4.8) 



The total computational cost in flops (floating-point opera- 
tions per second) can be obtained by multiplying C s hort by the 
sampling rate. 



C. The long FFT method 

As discussed above, in the long FFT method one constructs 
a time-frequency plane consisting of iV channe i s = (5/ ma x/<5/min 
different channels. The power in any time-frequency window 
can then be computed by summing the power in that region 
of the time-frequency plane. The data stream is broken into 
chunks of length M points, each chunk is FFTed, and the req- 
uisite number of channels are produced by bandpass filtering, 
Fourier transforming back into the time domain, and squaring 
the time samples. For each chunk, the computational cost of 
this step is 



Ci = M [3(1 + A cha „nels) log 2 M + ^channels]- 



(4.9) 



To search over the time-frequency plane, we first pick the 
frequency interval Sf and construct the 5f mlLX /Sf channels of 
this bandwidth; this requires M(Sf / f m i n — 1) additions. For 
each of these new channels, we sum up the power in various 
time intervals. This step requires A max operations per start 
time, of which there are M — jV m j n , Thus the total cost at this 
stage of the search is 



Co 



N\ channels 

E 



^=5* [My - 1) + N m . dX (M - AW)]- 
] (4.10) 



Since there are approximately M different start times, the cost 
per start time is given by the approximate formula 



c 



long 



(4.11) 



D. Comparison of the two methods 

The space of time-frequency windows to search over was 
delineated in the Introduction for the initial interferometers in 
LIGO. We adopt the corresponding parameter values <5/ m i n = 
2Hz, Sf mRX = 200Hz, St mia = 0.005s, and<5i max = 0.5s. The 
computational power required using the long FFT method is 
0.3 GFlops, which saves a factor of ~ 14 over the short FFT 
method if At = 0.001 seconds. 

In general, the computational gain afforded by the long FFT 
method over the short FFT method is given approximately by 



Cshort 
Cong 



1 



21n2, 



1 V 



(4.12) 



The first term shows that there is at least a factor of ~ 4 to 
be gained by the long FFT method; in addition, the computa- 
tional gain increases with the total time-frequency volume V. 
For V = 100, the second term is also ~ 4. 

There is a further benefit to the long FFT technique. It al- 
lows finer frequency resolution in the choice of starts f s and 
ends f s + Sf s of the frequency bands to be explored (although 
the above estimates of computational cost were for a search 
equivalent to the short FFT search). Moreover, as part of a 
hierarchical search, the long FFT method has a further advan- 
tage in that it allows follow ups to be made without significant 
further computations. The next stage of a hierarchical search 
might involve techniques other than the excess-power method, 
e.g., Hough transforms or other line tracking algorithms. 



V. MULTIPLE DETECTORS 

The network of gravitational wave detectors under con- 
struction around the world brings benefits that a single instru- 
ment cannot. This is especially true for "blind" search tech- 
niques, such as the power statistic. Since these techniques do 
not require the signal to have a specific form, random noise 
glitches are much more likely to meet the detection criteria 
than is the case for signal-specific searches such as matched 
filtering. Multiple-detector statistics will be much more effi- 
cient at rejecting such false alarms than single-detector statis- 
tics [ p8|p^ ]. In this section we consider the construction of 
the optimal detection strategy for a network of detectors. The 
derivation requires further formal development. For maxi- 



mum clarity, we introduce most of our notation in Sec. V A 



We derive the multi-instrument detection statistic for a net- 
work of aligned detectors in Sec. 



VB 



The two LIGO inter- 
ferometers at the Hanford site form such a network. In ad- 
dition, if we ignore the slight misalignment that arises from 
curvature of the earth, we can also include the interferome- 
ter in Livingston to form a three interferometer network. The 
gene ral c ase when not all instruments are aligned is treated in 



Sec. VC 



Our analysis is based on the formalism of Ref. [[30|] which 
followed earlier work of Ref. [[[o]]. We assume that the noise 
of the detector network is Gaussian. Even though we allow 
correlations between noise in different instruments, the as- 
sumption of Gaussian noise is a serious limitation since the 
main benefit of having several detectors is to combat non- 
Gaussian noise. It should be possible to adapt the theoretical 
models of non-Gaussian noise given in Ref. [[39|] in order to 
derive robust multi-detector statistics. However, it is neces- 
sary to understand first the Gaussian case. 



A. Notation and terminology 

Suppose the detector network consists of rid detectors. De- 
note the output of the entire network by the vector of time 
series 
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= {h 1 (0),h 1 (M),...,h 1 [(N-l)M], 

h 2 (0),... ,h\N-l)At],... , h nd [(N — l)Ai]} (5.1) 

where A G {1,2, ... , n d } and j G {0, 1, . . . , N - 1}. We 
assume that the noise n of the network follows a multivari- 
ate Gaussian distribution which is determined by the (Nnd) 2 
correlations 



(n A (iAt)n B (jAt)) = C AB (\i-j\At) = R AB 



(5.2) 



where (•) denotes an ensemble average. In general, the ele- 
ments of the network correlation matrix are 



B. Aligned detectors 

The simplest type of multi-instrument network to analyze is 
a network consisting of instruments which all respond to the 
same polarization component of the gravitational wave field. 
The two LIGO interferometers in Hanford form such a detec- 
tor, and if we ignore the slight misalignment arising from the 
curvature of the earth (a ~ 10% correction effect; see Table 
|b|) the third LIGO interferometer in Livingston can also be 
included. 

The signal at any detector is simply a time-delayed version 
of the signal that would be detected at the coordinate origin, 
which for simplicity we take to be at the center of the earth. 
Thus, the signal at detector A is 



(R)AN+i,BN+j — R 



All 



(5.3) 



The convention ( 5.3 ) for combining the capital and lower case 
indices to form in an Nnd X Nnd matrix is used from here 
on. The probability density function of the noise is given by 



p(fi) = [(2ir) Nn « dBtR]- 1 / a exp[-§(fl > a)] 



where the inner product is given by 

n d N-l 



(P,q)= E Ep 



Ar\AB B 

ij "j 



A,B=1 i,j=0 



and 



(5.4) 



(5.5) 



\t) = 3 (t + T A ), 



(5.10) 



where ta is the time of flight for a gravitational wave between 
detector A and the coordinate origin, and s(t) is the signal at 
the coordinate origin. The time delays ta depend on the direc- 
tion to the source: if m is a unit three vector in the direction 
of propagation of the gravitational wave (i.e., opposite to the 
direction to the source) and x A is the location of detector A, 
then 



ta = m • x A /c 



(5.11) 



where c is the speed of light. Finally, the DFT of the signal at 
detector A can be written as 



~A _ 2irikA A {m)/N ~ 



(5.12) 



4? = (R- 1 ) 



AN+i,BN+j ■ 



(5.6) 



For later convenience, we note that the inner product ( 5.5 ) can 
be written in terms of the discrete Fourier transforms of the 
detector time series, that is 



(p>q) 



E 

A,B=1 



LJV/2J 
k=0 



UAB-B 
/ j fk \"h / Ik 



(5.7) 



where 



N-l 



2 \^ e 2wijk/N R AB e -2wij'k'/N^ 
/ j 33 

j,f=o 



(5.8) 



The notation [N/2\ denotes the greatest integer less than 
or equal to N/2. These relations are strictly speaking valid 
only in the continuum and infinite time limits A< — > and 
NAt — * oo. Nevertheless, they are sufficiently accurate for 
most practical applications. Finally, we note that the likeli- 
hood ratio A(h; s) is given by 



A(h; s) = = exp[(h, s) - i(s, I)]. 

p(h|0) 



(5.9) 



where A^(m) = ta/ At and §k is the DFT of the signal at 
the origin of coordinates. 

A convenient description of the multi-instrument network 
response is the effective strain defined by p0|] 



A,B=1 



(5.13) 



where 



l/givff) — e -2nikA A (m)/N ^g-l^AB e 2nikA B (m)/N 

A,B=1 (5.14) 

We note that the effective strain depends on the direction m 
to the putative source through the time delays A^m). Gen- 
erally the same is true for Si , although not if there are no 
correlations between the instrumental noise at separated sites 
i3C 



+++ We can now write the likelihood ratio (p.9h as 



A(h;s)=e X ppW,s))-|((s,s 



(5.15) 



* M It might be reasonable to make this assumption for the three de- 
tectors at LIGO for example. 
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where 



W2J 

,ff))=4Re£«/ST- 

k=0 



(5.16) 



The posterior probability p s (h) that a signal is present given 
the data h is determined by integrating the likelihood against 
prior probabilities densities for the signal and for the source 
direction m. Thus 



Ps(h) 



A(h) 



PsO 



1 - PsO 



(5.17) 



where 



A(h) 



p(9, cj))d 2 n I p{T)dT 



p{e\T)d N ah(h.;a) 
(5.18) 



and p s o is the a priori probability that any gravitational wave 
sources exist. The mechanism of how information about the 
signals is encod ed in t he prior probabilities p(s\T) and p(T) 
is treated in Sec. HI A , and applies directly to the current con- 
text with only one modification: the inner product (•, •) should 
be replaced by ((•, •)). In particular, the probability distribution 
p(s\T) is given by Eq. (3.2). The function p(0, 4>) is the ex- 
pected distribution of source directions. For sources that are 
mostly further than ~ 30 Mpc, the distribution should be uni- 
form on the sphere. 



The integral in the expression ( 5.18 ) for the likelihood func- 
tion includes a sum over all time-frequency windows T and 
all source directions m. However, it is nearly equivalent, and 
much easier, to adopt the maximum term in the sum as an ap- 
proximation to the likelihood function, since the largest term 
will dominate the sum when a signal is present. It is therefore 
sufficient to consider only a single time-frequency region T 
and fixed direction m in the remainder of this section, with the 
understanding that the detection statistic will include a maxi- 
mization over these variables [[yj. 



Using arguments similar to those in Sec. MB, we can per- 



form the inte gral over signals s. In particular, the prior proba- 
bility in Eq. (3.2) restricts s to lie in a vector space Vr which 
contains only signals in the time-frequency window T. This 
has the effect of replacing s by s || in the inner products. More- 
over, the inner product ((■,■)) induces a natural inner product 
on the subspace Vr- The integral over angles can be per- 
formed as in Eq. (3.13) to show that 



A(h) = / T(V)e~ A /2 (A£ 1/2 /2) 1 - y I v _ 1 (A£ 1/2 )p(A)dA 

(5.19) 



where 



(eff) 



k x <k<k 2 



(5.20) 



Here h^u is the projection of the effective strain into the time- 
frequency subspace Vr ; the second equality holds for a partic- 
ular time-frequency window in which the signal has duration 



NAt and is localized to the frequency band fcj < k < k^- 
The amplitude A is define d by the A 2 = ((sn , sn )) . Since 



the right hand side of Eq. (5.19) is a monotonically increas- 
ing function of £, the Neyman-Pearson theorem tells us that £ 
provides the optimal multi-instrument detection statistic. Note 
that, as mentioned above, this detection statistic includes an 
implicit maximization over all source directions m and time- 
frequency windows T. 



C. General networks of detectors 

When the network contains at least one instrument with a 
different orientation to the others, it is necessary to discuss 
the two degrees of freedom, or polarizations, of the gravita- 
tional wave signal. We denote these two independent signals 
as s + (t) and s x (i), where the definition is with respect to a 
radiation coordinate system associated with the gravitational 
waves. (See Appendix |b] for a detailed discussion of these and 
the other coordinate systems relevant to this section.) For the 
Ath detector in the network, the gravitational wave strain is 



\t)=F A s+(t + T A ) + F^(t- 



(5.21) 



where Fjj 4 , F A are the detector beam pattern functions and 
ta = m • x^/c is the time delay between the origin of earth- 
fixed coordinates and the detector A (located at x" 4 ) for a wave 
propagating in the direction m. 

The concept of effective strain is particularly useful in the 
formal development of the multi-instrument detection statistic 
for gravitational waves. To introduce this concept, consider 
the inner product 



(s, s) = 



E 

A,B=1 



4Re 



LAT/2J 

E s t(s fe - 

k=0 



AB-B- 



The DFT of the signal at the ^4th detector is 



7.A 



= e 2«A A (m)fc/JV (F A-+ + pAgX) 



(5.22) 



(5.23) 



where A^(m) = ta/ 'At is the discrete time delay, At is the 
sampling rate, and s + x are the DFTs of the plus and cross 
polarizations of the signal defined in Eq. (5.21). The inner 
product can be rewritten as 

V N / 2 \ 

(s,s)= J2 4Re E s ~* e W* 

a,/3=+,x fc=0 (5.24) 

where 



0*9= £ e 2Trt(A A -A B )k/N F A is -yB^ 



A,B = 1 



(5.25) 



We can now introduce a pair of effective strains, correspond- 
ing to the + and x gravitational wave signals for the network, 
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by 
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K 



rid 

E < 

/3=+,X A,B 



E 



-2niA A k/N (S -1 )' 4B /l B 

(5.26) 



where X}/3=+ x ^t^^p-y ~ ^7 ■ ^ n terms of the effective 
strains h? and the signals s + x , the likelihood ratio is 



A(h;s)=exp[((M))-ip,g))] 
where the inner product is now defined as 

L7V/2J 



(5.27) 



(5.28) 



The effective strains and the inner product ( 5.28 ) depend on 
the direction m to the putative source. Consequently the prob- 
ability that a signal with plus polarization s + and cross polar- 



izations* is present in the data stream is given by Eqs. (5.17) 



and ( |5.18[ ) with A(h; s) defined by Eq. ( |5.27[ ) where the mea- 
sure p(s\T) on the space of signals is defined as follows. 

The signal {,s + , s x } now belongs to a 4F-dimensional vec- 
tor space which is the tensor product of two copies of V-r- 
Within this vector space, all directions can also be consid- 
ered to be equally likely. These assumptio ns ab out the signal 
{s+, s x } reduce to the assumption in Sec. V B when the de- 
tectors are aligned. Moreover, the reasoning from that section 
can be readily applied here provided one understands that the 
vector space of signals is now ^-dimensional and that all an - 
gles and lengths are measured using the inner product (5.24). 
Thus, the integrals can be carried out in much the same way 
to arrive at the excess power statistic for a multiple-instrument 
network: 



£ = ((h h k)) 



E 

Q,/3=+,X 



4Re 



E 

ki </c</C2 



h a pl k h 

rL k (J af3 n k 



f3* 



(5.29) 



As before there is an implicit maximization over time- 
frequency windows and source directions. 

Since the effective strains are linear combinations of the 
outpu ts of each of the detectors in the network, the statistic 
( 5.29| ) is a bilinear function of the outputs of all the detectors, 
containing both auto-correlation terms from each detector in- 
dividually and cross-correlation terms between each pair of 
detectors. It is the optimal statistic in Gaussian noise. When 
the noise in the instruments is non-Gaussian, it remains to 
be seen what is the best strategy. One obvious strate gy is 
to simply omit the auto-correlation terms in Eq. (5.29) and 
retain only the cross-correlation terms; the resulting statistic 
will share many of the nice features of £ and be more robust 
against non-Gaussian noise bursts. The real challenge is to de- 
rive the optimal statistic in the presence of uncorrelated noise 
bursts which are Poisson distributed in time. It is likely that 
the model introduced in Ref. \Q% can be used to address this 
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APPENDIX A: RELATED DETECTION STATISTICS 

In this appendix we discuss some detection statistics that 
can be obtained from the Bayesian formalism discussed in 
Sec. |ni| starting from different prior PDFs for signals s. 



1. Known signal spectrum 

Suppose that one knows, in addition to the duration and fre- 
quency band, the spectrum of the expected signal, but that one 
does not know the phase evolution. Let us adopt the Fourier 
basis (assuming that the autocorrelation matrix is reasonably 
close to diagonal in this basis) and assume that the noise is 
stationary and Gaussian. Then the likelihood ratio is 



A(h; s) = exp 



LJV/2J 

4 E (l^-fc||Sfe| cos( 

k=0 



i|S fe | 2 



)/s k 



<A1) 



where <fik represents the relative phase difference between the 
data and the expected signal in the kth frequency bin. Since 
the signal phases are considered unknown, we should inte- 
grate out these angles to obtain the integrated likelihood ratio 



[N/2] 



A(h;{P k },A) = U 2ne 



-A 2 P k /S k 



J (2 3/2 AP, 



1/2 



k=0 



hk\/S k ) 
(A2) 



where the one-sided signal spectrum is given by ^A 2 Pk = 

|5 fe | 2 with2Ei= / 2J ^/^ = l- 

In the limit of weak signal amplitude A, we can approxi- 
mate the likelihood ratio of Eq. ( A2 ) by its expansion in pow- 
ers of A. The first non-trivial term is the locally optimal [ p4| ] 
detection statistic 



d 2 lnA(h; {P k },A) 



dA 2 



= (const) + 4 



LJV/2J 

E 

fc=0 



Pk\hk\ 2 /S 2 k . 
(A3) 



This statistic is the weighted average of the detector output 
power in each frequency bin. Unfortunately, it is not pos- 
sible to get simple expressions for the false alarm and false 
dismissal probabilities for this statistic; one needs to use nu- 
merical methods to obtain these given a known signal power 
spectrum {Pk}- 
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2. Non-Gaussian noise 

It is unlikely that a gravitational wave detector will produce 
purely stationary and Gaussian noise. In the case that the de- 
tector noise distribution is known, we can obtain a detection 
statistic for unknown signals using the Bayesian methodology. 
Unfortunately the most general noise distribution contains 
many free functions and will not be known in practice. How- 
ever, constructing simple analytic non-Gaussian noise models 
and the associated detection statistics can give us insight into 
what kind of statistics to try out with real detectors. 

One such simple model is as follows. We assume that the 
detector noise is stationary, and, as before, that each frequency 
bin in the Fourier basis is uncorrected. Let make the addi- 
tional assumption that the power in each frequency bin is in- 
dependentally distributed, while the phases of each frequency 
bin are uniform and independent. Then 



L(iV-l)/2j 

p(n)d N n= [] M\n k \ 2 /S k ) 



k=l 



d\n k \ 2 daigfik 
Sk 



2n 



(A4) 



where fk(x) are known non-exponential probability distri- 
bution functions (exponential functions would correspond to 
Gaussian noise). Here we have omitted the DC (and a possible 
Nyquist) component. The likelihood ratio is 



L(JV-l)/2j - 

A(h;s)= [] 

fe=i 

L(JV-l)/2j 

- n 

k=l 



h\ 2 /S k ) 



f k (\h k \ys k ) 

2 Re(h k §t)f k (\h\ 2 /S k ) 
Sk f k (\h k \ 2 /S k ) 



j Re(h k s* k ) V f k '(\ht/S k ) 

V s k J f k (\hl/s k ) 



+ 



\h\ 2 f k (\h k \ 2 /S k ) 



Sk fk(\ht/S k ) 



■o(\~s k n 



(A5) 



We have expanded the likelihood ratio in powers of the (pre- 
sumed small) signal in order to construct the locally optimal 
detection statistic . 

To compute the integrated likelihood function we need to 
integrate over our prior knowledge of signals. Let us suppose 
that we do not know the signal phase evolution; then we can 
integrate over the unknown phases arg s k in each frequency 
bin. We find 



2 L(JV-1)/2J 2 
A(h- {Pk}, A) = 1 + ^- J2 — 
L fe=i 



more slowly in the fcth bin, e.g., fk(x) oc (1 — x/2) -2 , then 
we have (^.(x) = (x — 1)/(1 — x/2) 2 , which increases with x 
up to x = 2, and then decreases for larger values of x. Thus, 
large amounts of excess power in the fcth bin are suppressed. 

When the signal is known to be band-limited to frequency 
bins ki < k < k2, but we have no reason to believe that any 
particular bin in the band will contribute more to the overall 
signal-to-noise ratio than any other bin, then we obtain the 
locally optimal statistic by assuming a uniform weighting of 
the terms in Eq. dAq). Thus the locally optimal statistic is 



£ 9k{\h k \ 2 /S k ). 



(A8) 



k\ <.k<Ck2 



In the case of Gaussian noise this is the excess power statistic; 
for noise models with larger tails, the components of the sum 
are attenuated if they have large power. 



APPENDIX B: MULTIDETECTOR AMPLITUDES 



In Sec. V C, we discussed the detection of burst signals us- 
ing multiple detectors. When the detectors are not aligned, 
one needs the response functions of each detector in the net- 
work to a gravitational wave signal from a given sky posi- 
tion. Here we define reference coordinates to which we re- 
fer each detectors response. Consider a coordinate system 
fixed at the center of the earth. In terms of latitude and longi- 
tude {cp, A}, the coordinate axes are oriented so that the x- 
axis pierces the earth at {000, 000}, the y-axis pierces the 
earth at {000,090°E}, and the z-axis pierces the earth at 
{090°N, 000}. We denote the location of a source on the ce- 
lestial sphere by standard spherical polar coordinates {9, 4>} 
measured with respect to this earth fixed frame. A fidu- 
cial signal s comes from a sky position with right ascension 
a = (j> + GMST (GMST is the Greenwich mean sidereal time 
of arrival of the signal), declination 5 = tt/2 — 9, and has 
polarization angle %jj — the angle (counter-clockwise about the 
direction of propagation) from the line of nodes to the X-axis 
of the signal coordinates. In particular, this gravitational wave 
signal can be represented by a tensor 



; (e x )i 



where the polarization tensors are given by 

(e+),j = (X®X-Y®Y) y 
(e x )y = (X® Y + Y®X)ij. 



(Bl) 



(B2) 
(B3) 



S k 



gk(\h k \ 2 /S k ) + 0(A 4 ) 
(A6) 



where \A 2 P k 



Isfcl 2 and 



9k{x) = [xf k (x)+ f' k (x)]/f k {x). 



(A7) 



For Gaussian noise, f k (x) — e~ x and g k (x) = x — 1 for 
all k, which gives essentially the same detection statistic as in 
Eq. ( A3 ). For a probability distribution with tails that decrease 



The vectors X and Y are the axes of the wave-frame, given 
explicitly by 

X = (sin <fi cos rp — sin ip cos <f> cos 6) i 

— (cos 4> cos ijj + sin ip sin 4> cos 9) j + sin ip sin 9 k (B4) 
Y = (— sin (j> sin ip — cos ip cos <f> cos 9) i 

+ (cos (f> sin ip — cos ip sin <f> cos 9) j + sin 9 cos ijj k (B5) 

where the polarization angle ip is defined above, and i, j and k 
are unit vectors along the x, y and z-axes respectively. Note, 
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we use a right handed coordinate system in which the vector 
Z = XAY points in the direction from the source towards the 
detector. The waveforms in Refs. are referred to these 

coordinates; Thorne uses a different definition in Ref. M\. 

One can characterize the response of an interferometer on 
the surface of the earth to the impinging gravitational wave 
using another tensor D given by 



D; 



(B6) 



For the two LIGO interferometers, these vectors are provided 
in [44|. For the other interferometers we used the values in 
note (with tilt angles uj — 0), or the values given in 
Ref. [BM (with elevations h = and tilt angles ui = 0). 



where xi x and n y are unit vectors along the x and y arms of 
the interferometer respectively. For a given interferometer A, 
it is now straightforward to compute the response 



s A = 



3 

E 



(B7) 



and to extract the response functions F A x by comparing the 



result with the formula: 

A 



F 



A s + 



F A s> 



(B8) 
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TABLE I. The locations x (the coordinates are in meters) and direction vectors {n x ,n v } for the various interferometers around the world 
based on the the data in j4rj , p^ ] and the ellipsoidal model described in jp4[]. Note that the arm orientations reported by Allen for the two 
LIGO interferometers do not correctly represent the angles between the northing and the arms at the two sites; we have therefore stated the 
official LIGO arm orientation vectors. For VIRGO, GEO-600, and TAMA-300, see For the first five interferometers, the results are based 
entirely on the numbers reported in Allen. 



Project 



Location 



x ( x Iff m) 



CIT 
MPQ 
IS AS- 100 
TAMA-20 
Glasgow 



Pasadena, CA, USA 
Garching, Germany 
Tokyo, Japan 
Tokyo, Japan 
Glasgow, UK 



{-0.2648, 
{-0.7304, 
{+0.7634, 
{+0.7727, 
{-0.4534, 



-0.4953 
-0.3749 
-0.2277 
-0.2704 
-0.8515 



-0.8274} 
+0.5709} 
+0.6045} 
+0.5744} 
+0.2634} 



{+0.8819 
{+0.2027. 
{+0.1469. 
{-0.1451. 
{+0.6938. 



-0.4715 
+0.9172 
+0.8047 
-0.8056 
-0.5227 



+0.0000} 
-0.3430} 
-0.5752} 
+0.5744} 
-0.4954} 



{-2.490650, 
{+4.167725,- 
{-3.947704, 
{-3.946416, 
{+3.576830, 



-4.658700,+3. 
-0.861577,+4. 
-3.375234,+3. 
-3.365795,+3. 
-0.267688,+5. 



562064} 
734691} 



699409} 
256335} 



TAMA-300 Tokyo, Japan {+0.6490,+0.7608,+0.0000} {-0.4437,+0.3785, -0.8123} {-3.946409,+3.366259,+3.699151} 

GEO-600 Hannover, Germany {-0.6261, -0.5522,+0.5506} {-0.4453,+0.8665, +0.2255} {+3.856310,+0.666599,+5.019641} 

VIRGO Pisa, Italy {-0.7005, +0.2085,+0.6826} {-0.0538,-0.9691, +0.2408} {+4.546374,+0.842990,+4.378577} 

LIGO Hanford, WA, USA {-0.2239,+0.7998,+0.5569} {-0.9140,+0.0261, -0.4049} {-2.161415,-3.834695,+4.600350} 

LIGO Livingston, LA, USA {-0.9546,-0.1416,-0.2622} {+0.2977,-0.4879,-0.8205} {-0.074276,-5.496284,+3.224257} 
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FIG. 1. A flow chart for the short FFT algorithm for the power 
filter. 
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FIG. 2. A flow chart for the long FFT algorithm for the power 
filter. 



20 




0.01 r 



400 
300 
200 
100 




FA-0.01, FD=0.01 

FA=0.0O01, FD=0.01 
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(c) 



10 100 
Degrees of freedom (2V) 



FIG. 3. Cumulative probability functions for (a) the \ distri- 
bution and (b) the non-central \ 2 distribution for various degrees of 
freedom 2V . The curves in (a) give the probability that the power £ 
exceeds a threshold £ * when no signal is present. They can be used 
to fix the threshold given the time-frequency volume V and the de- 
sired false alarm probability. The curves in (b) give the probability of 
detecting a signal whose power is A 2 in the time-frequency volume 
V and given a threshold £ *, where the threshold £* is chosen to give 
a false alarm probability of 0.01. (c) The signal-to-noise ratio A 2 
necessary to achieve a given false alarm probability (FA) and a false 
dismissal probability (FD) of 0.01 for various values of the number 
2V of degrees of freedom. 




6 2 

FIG. 4. The relative effectiveness r\ of the excess power 
method with respect to matched filtering, for the case where the 
time-frequency window is known in advance, as a function of the 
effective number of independent templates A4ff characterizing the 
space of signals being sought, and the time-frequency volume V. 
A false alarm probability corresponding to one false alarm per one 
hundred days of observation (taken to have 1.728 x 10 9 independent 
arrival times), and a correct detection probability of 0.99 were as- 
sumed. The quantity r\ is the ratio of the minimum signal amplitude 
that is required to achieve these false alarm and dismissal probabili- 
ties for the excess power method, to the corresponding minimum am- 
plitude for the matched filtering method. The loss in event rate due 
employing the excess power method rather than matc hed fi lte ring i s 



Tj . T his p lot c an be generated by combining Eqs. (2.24), (2.29) 



(2.30) and (2.31 ) of the text 
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FIG. 5. The probability P(£ > £*|0) of obtaining a value of £ 
greater than a given threshold £ *, for stationary Gaussian noise in the 
absence of a signal, as a function of £*, The quantity £ has a \ 2 dis- 
tribution with 2V degrees of freedom, where V is the time-frequency 
volume (2.5) of the signal being sought. The lines show theo retical 
curves generated according to the standard \ 2 formula (2.15). The 



data points repres ent va lues obtained from Monte Carlo simulations 
(described in Sec. [V Al, where the noise is Gaussian and white (cir- 



cles) or colored (triangles). 




100 



FIG. 6. The probability P(£ > £*\A) of obtaining a value of £ 
greater than or equal to a given threshold £* given the presence of a 
signal of power A 2 and stationary Gaussian noise, for several differ- 
ent values of the number 2V of degrees of freedom, as a function of 
A 2 . For each V, the value of £* which gives a false alarm probability 
of Qo = 0.01 is used. The lines show theoretical curve s generated 



according to the standard non-central \ formula (2.23). The data 



points represen t valu es obtained from Monte Carlo simulations [de- 
scribed in Sec. 



(triangles). 



IV A], where the noise is white (circles) or colored 
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